Ancestral risk modification for multiple sclerosis susceptibility detected across the Major Histocompatibility Complex in a multi-ethnic population

The Major Histocompatibility Complex (MHC) makes the largest genetic contribution to multiple sclerosis (MS) susceptibility, with 32 independent effects across the region explaining 20% of the heritability in European populations. MHC variation is high across populations, with allele frequency differences and population-specific risk alleles identified. We sought to identify MHC-specific MS susceptibility variants and assess the effect of ancestral risk modification within 2652 Latinx and Hispanic individuals as well as 2435 Black and African American individuals. We have identified several novel susceptibility alleles which are rare in European populations, including HLA-B*53:01, and we have utilized the differing linkage disequilibrium patterns inherent to these populations to identify an independent role for HLA-DRB1*15:01 and HLA-DQB1*06:02 on MS risk. We found a decrease in Native American ancestry in MS cases vs controls across the MHC, peaking near the previously identified MICB locus with a decrease of ~5.5% in Hispanics and ~0.4% in African Americans. We have identified several susceptibility variants, including within the MICB gene region, which show global ancestry risk modification and indicate ancestral differences which may be due in part to correlated environmental factors. We have also identified several susceptibility variants for which MS risk is modified by local ancestry and which indicate true ancestral genetic differences, including HLA-DQB1*06:02, for which MS risk for European allele carriers is almost two times the risk for African allele carriers. These results validate the importance of investigating MS susceptibility at an ancestral level and offer insight into the epidemiology of MS phenotypic diversity.


Introduction
The Major Histocompatibility Complex (MHC), located on chromosome 6p21.3, is vital to proper immune system function due to its central role in the initiation of the adaptive immune response. Faulty reactions may result in destruction of normal tissue and manifest as an autoimmune disease such as multiple sclerosis (MS), a neurodegenerative disease characterized by transmigration of peripheral autoreactive leukocytes into the central nervous system. The MHC makes the single largest genetic contribution to MS susceptibility in whites of European ancestry, on its own explaining ~20% of the heritability estimated from genotyped SNPs [1]. It was first identified as a determinant of MS risk in the 1970s by utilizing lymphocyte cultures [2] and lymphocytotoxic antisera reactions [3]. The most extensively studied and replicated association has been seen with HLA-DRB1*15:01, demonstrating the strongest genetic effect in European [1] and African American individuals [4]. Early genome-wide association studies (GWAS) in populations of European ancestry confirmed the effect of HLA-DRB1*15:01 (risk, class II) and identified HLA-A*02:01 (protective, class I), HLA-DRB1*03:01 (risk, class II), and HLA-DRB1*13:03 (risk, class II) as susceptibility loci within the MHC [5]. Subsequent analyses extended this list further, but only in populations of European ancestry [1,6,7]. Presently, 32 statistically independent additive and dominant effects across class I, class II, and non-HLA genes within the MHC have been identified [1]. Prior evidence has also pointed to the effect of interactions on MS risk, both epistatic interactions between MHC alleles [8] and environmental interactions with MHC alleles [9], highlighting the complexity of the role of the MHC in disease susceptibility.
The MHC is both polygenic (containing a variety of genes with a range of binding specificity) and highly polymorphic (containing multiple variants within each gene), making antigen evasion difficult. While being polygenic allows an individual to respond to a wide array of antigens (i.e. ensuring survival of an individual), polymorphism ensures the capture of antigens at a species-wide level (i.e. ensuring survival of a species) [10]. While this variability is crucial from a biological standpoint, the long-range linkage disequilibrium (LD) and extensive allelic heterogeneity inherent to the region have made refining the MS-associated risk signal to the underlying causal variants a difficult endeavor. MHC variation across ancestral populations is also high, with numerous population-specific alleles and allele frequency differences noted [10]. Therefore, the 32 susceptibility alleles identified in populations of predominantly European ancestry may not represent the most relevant susceptibility alleles across the MHC in all ancestrally distinct populations. In fact, a study in 2004 demonstrated that the HLA-DRB1*15:03 allele, which occurs almost exclusively in individuals of African descent, confers moderate risk of MS in African American individuals but no measurable risk in other populations [11,12]. Previous studies in ancestrally diverse Latin American populations, across various countries of origin, were small in size and demonstrated inconclusive results [13-18]. While prevalence of MS has traditionally been considered lower in Latinx and Hispanic individuals and Black and African American individuals than in individuals of European ancestry, epidemiological evidence now indicates that prevalence may be more similar between populations than previously indicated, and clinical manifestations are diverse [19].
A detailed investigation of this region in ancestrally admixed individuals formed through interbreeding of populations, including Latinx and Hispanic individuals as well as Black and African American individuals, could serve both to uncover the MHC risk in these underrepresented populations and to more definitively chronicle the ancestral lineage of MHC haplotypes.
Our objective is therefore to test for the association of genetic variation across the extended MHC (chromosome 6 from 29-34 MB) with MS in a multi-ethnic cohort of self-identified Latinx and Hispanic individuals (collectively we refer to them as Hispanic) and Black and African American individuals (collectively we refer to them as African American) and then to assess the effect of ancestral allele origin on risk.

Study population
2995 self-reported Hispanic individuals (1558 with MS, 1437 controls) and 2630 self-reported African American individuals (1427 with MS, 1203 controls) were ascertained from seven US participating institutions as part of the Alliance for Research in Hispanic MS (ARHMS) Consortium. Following all quality control, a total of 2652 unrelated Hispanics (1298 MS cases and 1354 controls) and 2435 unrelated African Americans (1298 MS cases and 1137 controls) remained for analysis. A detailed review of the quality control process and a breakdown of samples by ascertainment site have been previously described [20]. The institutional review boards at each institution approved this study, and all participants provided written informed consent prior to participation.
We additionally obtained DNA on 498 European (EUR), 379 African (AFR), 180 East Asian (EAS), and 299 Central and South American Hispanic (AMR) samples from the 1000 Genomes Project [21] to be used for quality control. A total of 436 EUR, 318 AFR, 160 EAS, and 262 AMR remained after exclusion of individuals with low call rate (≤98%), excess autosomal heterozygosity (more than 3 SD from the mean), and excess identity by descent signifying a sample duplication or relatedness (proportion IBD > 0.2).

Genotype calling and SNP quality control
All DNA were obtained through whole blood extraction. DNA were genotyped on the MS Chip, an Illumina Infinium custom genotyping array which contains targeted and dense coverage of the extended MHC, specifically designed for imputation [1,20]. All sample genotyping was conducted by the Center for Genome Technology within the John P. Hussman Institute for Human Genomics at the University of Miami. Genotype calling was done using GenomeStudio 2.0, and manual review was done for all 17,963 variants across the extended MHC.
Variants were excluded within Hispanics, African Americans, and each 1000 Genomes population separately based on poorly performing clusters, low call rate (CR) with respect to minor allele frequency (MAF) (CR ≤ 99.5% when MAF ≤ 5%, CR ≤ 99% when 5% < MAF ≤ 10%, and CR ≤ 98% when MAF > 10%), and discordance between plate controls (one genotyping control on each 96-well plate). Within the Hispanic and African American study samples, variants were additionally excluded if they were out of Hardy-Weinberg equilibrium (chi-square p ≤ 1.00 × 10^-05 in disease controls) or were differentially missing between those with MS and controls (p ≤ 1.00 × 10^-03). In total, a common set of 9909 SNPs remained across the MHC in the Hispanic and African American study sample following all quality control; 8856 in EUR, 8859 in AFR, 8848 in EAS, and 8877 in AMR.
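The MAF-dependent call-rate rule described above can be illustrated with a short Python sketch. This is not the study's pipeline: the function name, the toy variants, and the handling of values that fall exactly on a threshold are assumptions made for illustration only.

```python
def fails_call_rate(call_rate: float, maf: float) -> bool:
    """Return True if a variant would be excluded for low call rate.

    The three MAF bins follow the rule in the text; how values exactly
    on a boundary are treated is an assumption of this sketch.
    """
    if maf <= 0.05:
        threshold = 0.995   # rare variants need the highest call rate
    elif maf <= 0.10:
        threshold = 0.99
    else:
        threshold = 0.98    # common variants tolerate more missingness
    return call_rate <= threshold


# Hypothetical variants: name -> (call rate, minor allele frequency)
variants = {
    "rare_ok": (0.999, 0.02),
    "rare_bad": (0.992, 0.02),
    "common_ok": (0.985, 0.30),
}
kept = [name for name, (cr, maf) in variants.items()
        if not fails_call_rate(cr, maf)]
print(kept)
```

Tying the threshold to MAF reflects the fact that a few missing genotypes distort a rare variant's allele counts far more than a common variant's.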

Imputation and accuracy assessment of classical HLA alleles
Classical HLA alleles, SNPs, and amino acid residues were imputed simultaneously from genotyped SNPs using HLA-TAPAS and a multi-ethnic reference panel of 2504 samples from the 1000 Genomes Project (503 EUR, 661 AFR, 347 AMR, 504 EAS, 489 SAS) for each of the Hispanic and African American samples and for a subset of the genotyped 1000 Genomes Project samples which were non-overlapping with the multi-ethnic reference panel (100 AFR, 100 AMR, 100 EUR, 153 EAS) [22]. Additionally, using the same multi-ethnic reference panel, HLA-TAPAS imputation was performed on Illumina Multi-Ethnic Genotyping Array (MEGA) data for 40 unrelated Native American (NAM) samples from the Human Genome Diversity Project (HGDP), which were provided to us through the PAGE Consortium [23]. These data included individuals from the Surui and Karitiana populations of Brazil, the Maya population of Mexico, and a native population of Colombia. At the variant level, alleles, SNPs, and amino acid residues imputed with allelic R-squared less than 0.4 were removed from further analysis. In addition, individual genotypes with an estimated genotype probability (GP) of less than 0.8 were zeroed, and subsequently variants with call rate less than 95% were removed.
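The three post-imputation filters just described (allelic R-squared, genotype probability, and call rate) can be sketched in Python as follows. The data layout and the function name are hypothetical and are not the HLA-TAPAS output format; only the three cut-offs come from the text.

```python
def filter_imputed(variants, r2_min=0.4, gp_min=0.8, call_rate_min=0.95):
    """Apply variant-level and genotype-level imputation filters.

    variants: dict mapping a variant name to
        {"r2": allelic R-squared, "gp": [best genotype probability per sample]}
    Returns a dict mapping each retained variant to a list of booleans,
    True where the individual genotype is kept, False where it is zeroed.
    """
    kept = {}
    for name, info in variants.items():
        if info["r2"] < r2_min:
            continue  # poorly imputed variant: drop entirely
        mask = [gp >= gp_min for gp in info["gp"]]  # zero low-confidence genotypes
        call_rate = sum(mask) / len(mask)
        if call_rate < call_rate_min:
            continue  # too many zeroed genotypes after the GP filter
        kept[name] = mask
    return kept


# Hypothetical example with 20 samples per variant
toy = {
    "well_imputed": {"r2": 0.9, "gp": [0.99] * 20},
    "low_r2": {"r2": 0.3, "gp": [0.99] * 20},
    "low_call_rate": {"r2": 0.8, "gp": [0.99] * 17 + [0.5] * 3},
}
passing = filter_imputed(toy)
print(list(passing))
```

Note that the call-rate filter is evaluated after genotypes are zeroed, so a variant with a good R-squared can still be removed if too many individual calls are uncertain.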
To assess the validity of HLA-TAPAS imputation, publicly available classical HLA types produced by colleagues at the University of California, San Francisco [24] and hosted by the International Genome Sample Resource were downloaded for the subset of 1000 Genomes Project samples which we also imputed. HLA typing methodology for the 1000 Genomes samples has been described in detail previously [24]. Concordance between the HLA types produced from classical Sanger sequencing and the HLA types imputed from HLA-TAPAS for individuals from AFR, AMR, EUR, and EAS was assessed to determine imputation reliability in diverse populations. In addition, HLA types from Sanger sequencing were available for 615 of our African American cases, and imputation accuracy was assessed in the same way (typing methodology indicated in S1 Table). Specifically, within the African American and 1000 Genomes Project samples with Sanger validated allele calls; rate of quality imputation was first assessed as the number of alleles which were successfully imputed and passed quality control (i.e. quality alleles) divided by the number of possible imputed alleles. Concordance rates were then assessed as the number of quality alleles that were concordant with the Sanger allele calls divided by the total number of quality alleles.
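The two accuracy measures defined above can be written out as a short Python sketch. The function name and the toy allele calls are hypothetical; `None` is used here to stand for an allele that was not imputed or did not pass quality control.

```python
def imputation_accuracy(imputed, sanger):
    """Return (quality rate, concordance rate) for paired allele calls.

    imputed: imputed allele calls, None where the call failed QC
    sanger:  Sanger-validated calls for the same allele slots
    """
    quality = [(i, s) for i, s in zip(imputed, sanger) if i is not None]
    quality_rate = len(quality) / len(imputed)
    if not quality:
        return quality_rate, float("nan")  # concordance undefined without quality alleles
    concordant = sum(1 for i, s in quality if i == s)
    return quality_rate, concordant / len(quality)


# Hypothetical calls at five allele slots
imputed_calls = ["15:01", "03:01", None, "13:03", "15:03"]
sanger_calls = ["15:01", "03:01", "07:01", "13:01", "15:03"]
quality_rate, concordance = imputation_accuracy(imputed_calls, sanger_calls)
print(quality_rate, concordance)
```

Because concordance is computed only over quality alleles, a locus can show a low quality rate and a high concordance rate at the same time.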

Ancestry estimation
To evaluate local ancestry, a phased set of 20,048 genotyped or imputed MHC variants which were non-monomorphic and passed all quality control in Hispanic, African American, and HGDP Native American samples were generated with HLA-TAPAS. Local ancestry was then assessed in Hispanic and African American samples using RFMix v2 [25] with phased reference panels of Native Americans (40 HGDP), Europeans (40 CEU 1000G), and Africans (40 YRI 1000G). Equal sized reference panels were chosen to ensure no bias was introduced into the estimation of local ancestry. Local ancestry was evaluated as the number of African (AFR), European (EUR), and Native American (NAM) haplotypes seen at each variant position (0, 1, or 2 for an individual). Global ancestry was evaluated as the proportion of ancestry from each reference population with ADMIXTURE [26] using the same reference panel and a set of genome-wide tagging SNPs consisting of 10,928 independent non-MHC SNPs (R^2 ≤ 0.2) which were not within 1 Mb of any previously identified MS risk variant [1], and passed quality control in the Hispanic, African American, and HGDP NAM samples.
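The coding of local ancestry as a count of 0, 1, or 2 haplotypes per ancestry at each position can be illustrated with a small Python sketch. The input layout (one ancestry label per position for each phased haplotype) is an assumption for illustration and is not the RFMix output format.

```python
from collections import Counter


def ancestry_dosage(hap1, hap2, labels=("AFR", "EUR", "NAM")):
    """Count haplotypes of each ancestry at every variant position.

    hap1, hap2: ancestry labels along an individual's two phased haplotypes
    Returns one dict per position mapping ancestry -> 0, 1, or 2.
    """
    dosages = []
    for a, b in zip(hap1, hap2):
        counts = Counter((a, b))
        dosages.append({label: counts.get(label, 0) for label in labels})
    return dosages


# Hypothetical individual with three variant positions
dosage = ancestry_dosage(["EUR", "EUR", "NAM"], ["AFR", "EUR", "NAM"])
print(dosage)
```

At every position the three counts sum to 2, so only two of them are free, which is why a regression model needs at most two local ancestry terms.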

Analysis of classical HLA alleles and SNPs
Marginal association between MS status and each of the classical HLA alleles, SNPs, and amino acid residues (collectively termed as variation within the MHC) was assessed using logistic regression in the Hispanic and African American study sample, adjusting for global ancestry proportions to account for population substructure.
Forward stepwise conditional logistic regression was used to identify statistically independent effects. The primary classical allele was included as a covariate, and the association analysis was repeated for the remaining classical alleles, allowing for additive, dominant, and recessive effects at each step. This process was repeated until no classical alleles reached the minimum suggestive level of significance (p-value < 1 × 10^-03). Following the inclusion of all qualified classical alleles, the process was continued to allow for inclusion of SNPs or amino acid residues at a more stringent threshold of p-value < 1 × 10^-04, again allowing for additive, dominant, and recessive effects.
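The control flow of this forward stepwise procedure can be sketched in Python. The sketch deliberately leaves the regression itself abstract: `pvalue_fn` is a hypothetical callback standing in for a logistic regression refit that conditions on the already selected variants, and the toy p-values are invented for illustration.

```python
def forward_stepwise(candidates, pvalue_fn, threshold, initial=()):
    """Greedy forward selection of statistically independent effects.

    candidates: iterable of variant names to consider
    pvalue_fn(candidate, selected): p-value of `candidate` in a model that
        already includes every variant in `selected` (hypothetical callback)
    threshold: stop when no remaining candidate has p < threshold
    initial: variants already in the model (e.g. from an earlier stage)
    """
    selected = list(initial)
    remaining = [c for c in candidates if c not in selected]
    while remaining:
        pvals = {c: pvalue_fn(c, selected) for c in remaining}
        best = min(pvals, key=pvals.get)
        if pvals[best] >= threshold:
            break
        selected.append(best)   # condition on the top signal and repeat
        remaining.remove(best)
    return selected


# Toy p-values: alleleB is only a proxy for alleleA
marginal = {"alleleA": 1e-12, "alleleB": 1e-8, "alleleC": 1e-5, "alleleD": 0.2}


def toy_pvalue(candidate, selected):
    if candidate == "alleleB" and "alleleA" in selected:
        return 0.5  # signal disappears after conditioning on alleleA
    return marginal[candidate]


independent = forward_stepwise(marginal, toy_pvalue, threshold=1e-3)
print(independent)
```

A second stage for SNPs and amino acid residues could call the same function with `threshold=1e-4` and `initial=independent`, so that new candidates are tested conditional on the classical alleles already selected.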

Assessing the effect of sex on MS risk
To determine whether sex modifies the effect of variation on MS risk, we used a logistic regression model to evaluate the effect of an interaction between sex and each independent risk variant using the likelihood ratio test (LRT). A full model (Eq 1) was compared to a reduced model without the interaction term, where G_EUR and G_NAM represented the global European and Native American ancestry components and Allele represented the number of copies (0, 1, or 2) of each independent risk allele.
When a sex interaction was observed (p < 0.05), sex stratified analyses were performed.
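The likelihood ratio test used for these nested models can be sketched with the Python standard library alone. The log-likelihood values below are hypothetical, and the chi-square tail function is a small helper written for this illustration (valid for positive integer degrees of freedom), not part of any named statistics package.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square distribution with integer df."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        tail, k = math.exp(-half), 2             # closed form for df = 2
    else:
        tail, k = math.erfc(math.sqrt(half)), 1  # closed form for df = 1
    while k < df:
        # recurrence: Q(k + 2) = Q(k) + (x/2)^(k/2) * exp(-x/2) / Gamma(k/2 + 1)
        tail += half ** (k / 2) * math.exp(-half) / math.gamma(k / 2 + 1)
        k += 2
    return tail


def lrt_pvalue(loglik_full: float, loglik_reduced: float, df: int) -> float:
    """P-value for a full model versus a nested reduced model."""
    statistic = 2.0 * (loglik_full - loglik_reduced)
    return chi2_sf(statistic, df)


# Hypothetical log-likelihoods; one interaction term gives df = 1
p_interaction = lrt_pvalue(-1700.0, -1703.0, df=1)
print(round(p_interaction, 4))
```

The degrees of freedom equal the number of parameters dropped from the full model, so a single sex-by-allele interaction term corresponds to one degree of freedom.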

Assessing the effect of ancestry on MS risk
To determine the contribution of local ancestry to MS risk across the MHC, we used the following model (Eq 2), where L_EUR and L_NAM represent the local ancestry components. The association of local ancestry with MS risk was assessed using an LRT to compare the full model to a restricted model which excluded local ancestry.
To determine whether ancestry modifies the effect of genetic variation on MS risk, we used two logistic regression-based models. The first model evaluated the role of global ancestry on risk variation by evaluating the effect of an interaction between global ancestry and each independent risk variant using the LRT. A full model (Eq 3) was compared to a reduced model without the interaction terms.
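Since the equation itself is not reproduced in this text, the following LaTeX block gives only one plausible form of the nested models for the global ancestry interaction. It assumes the same covariates named for the sex model (global European and Native American ancestry, and allele count); the coefficient symbols are introduced purely for illustration and may differ from the published Eq 3.

```latex
\begin{align}
\text{Full:}\quad \operatorname{logit} P(\mathrm{MS}) &= \beta_0 + \beta_1 G_{EUR} + \beta_2 G_{NAM} + \beta_3\,\mathrm{Allele} \nonumber \\
  &\quad + \beta_4\,(G_{EUR} \times \mathrm{Allele}) + \beta_5\,(G_{NAM} \times \mathrm{Allele}) \\
\text{Reduced:}\quad \operatorname{logit} P(\mathrm{MS}) &= \beta_0 + \beta_1 G_{EUR} + \beta_2 G_{NAM} + \beta_3\,\mathrm{Allele}
\end{align}
```

Under this assumed form, the LRT statistic $2(\ell_{\text{full}} - \ell_{\text{reduced}})$ would be compared to a chi-square distribution with two degrees of freedom, one for each interaction term removed.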
When a global ancestry interaction was observed (p < 0.05), a graphical model was used to illustrate the variant-specific ancestral differences in MS risk by first computing the estimated odds of risk using varying scenarios of global ancestry proportion and copies of the MS-associated allele. Specifically, within the Hispanic sample the estimated odds of MS (Eq 3) were computed by holding the proportion of AFR ancestry constant at the average of 11% observed in our sample and varying the NAM ancestry proportion using the maximum, mean, and minimum observed in our sample (82%, 15%, and 0% respectively). In this way, estimated odds of risk were computed for each of 0, 1, and 2 copies of the variant allele under three different ancestral scenarios and plotted linearly to illustrate changes in effect. Similarly, within the African American sample, the estimated odds of MS were computed by holding the proportion of NAM constant at the average of 2% observed in our sample and varying the AFR ancestry proportion using the observed maximum, mean, and minimum (98%, 78%, and 16% respectively). The second model evaluated the role of local ancestry on risk variation by evaluating the effect of an interaction between local ancestry and each independent risk variant. Again, a full model (Eq 4) was compared to a reduced model without the interaction terms, where L_STATE represented the local ancestry state at the corresponding variant position (EUR*EUR, EUR*AFR, EUR*NAM, AFR*AFR, AFR*NAM, and NAM*NAM as the six possible ancestral states for each phased allele).
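The scenario calculation for the global ancestry model can be sketched in Python. The model form (global EUR and NAM ancestry, allele count, and their interactions) and every coefficient value below are assumptions chosen for illustration; they are not estimates from the study.

```python
import math


def estimated_odds(coef, nam, afr, copies):
    """Odds of MS under an assumed logistic model with ancestry-by-allele terms.

    coef: hypothetical coefficients on the log-odds scale
    nam, afr: global ancestry proportions; EUR is taken as the remainder
    copies: number of copies of the variant allele (0, 1, or 2)
    """
    eur = 1.0 - nam - afr
    log_odds = (coef["intercept"]
                + coef["eur"] * eur
                + coef["nam"] * nam
                + coef["allele"] * copies
                + coef["eur_x_allele"] * eur * copies
                + coef["nam_x_allele"] * nam * copies)
    return math.exp(log_odds)


# Hypothetical coefficients, for illustration only
coef = {"intercept": -1.0, "eur": 0.0, "nam": 2.0,
        "allele": 0.1, "eur_x_allele": 0.0, "nam_x_allele": 1.3}

# Hispanic-style scenarios: AFR fixed at 11%, NAM at minimum, mean, maximum
table = {nam: [estimated_odds(coef, nam, 0.11, c) for c in (0, 1, 2)]
         for nam in (0.0, 0.15, 0.82)}
for nam, odds in table.items():
    print(nam, [round(o, 2) for o in odds])
```

Because the interaction enters on the log-odds scale, the per-copy odds ratio in such a model is itself a function of ancestry, which is what allows the plotted lines for the three scenarios to diverge.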
When a local ancestry interaction was observed (p < 0.05), a haplotype model was used to assess MS risk in an allele-specific manner. By testing for presence of an allele from a certain ancestral background against absence of the allele on the same ancestral background, we were able to estimate effect sizes for ancestry-specific haplotypes. Specifically, each allele was first grouped by local ancestry state and then, within each ancestral group, the effect of the allele on MS risk was assessed using generalized estimating equations (GEE). Individual was used as the grouping variable, and adjustment for global ancestry was made. A global ancestry interaction in the absence of a local ancestry interaction would be taken as evidence that the ancestral interaction may be due to highly correlated environmental factors rather than to genetic ancestry alone.
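The grouping step of the haplotype model (presence versus absence of an allele on the same ancestral background) can be illustrated in Python. This sketch only builds ancestry-specific 2x2 tables and a crude odds ratio with a continuity correction; it does not fit a GEE, does not account for the two haplotypes of one person being correlated, and does not adjust for global ancestry. The data layout and function names are hypothetical.

```python
from collections import defaultdict


def haplotype_tables(individuals):
    """Build one 2x2 table per local ancestry.

    individuals: list of dicts with keys
        "case": True for MS cases, False for controls
        "haps": two (ancestry, carries_allele) tuples, one per haplotype
    Returns ancestry -> [[case_with, case_without],
                         [control_with, control_without]]
    """
    tables = defaultdict(lambda: [[0, 0], [0, 0]])
    for person in individuals:
        row = 0 if person["case"] else 1
        for ancestry, carries in person["haps"]:
            col = 0 if carries else 1
            tables[ancestry][row][col] += 1
    return dict(tables)


def crude_odds_ratio(table):
    """Odds ratio with a 0.5 continuity correction for sparse cells."""
    (a, b), (c, d) = table
    return ((a + 0.5) * (d + 0.5)) / ((b + 0.5) * (c + 0.5))


# Hypothetical individuals
people = [
    {"case": True, "haps": [("EUR", True), ("AFR", False)]},
    {"case": True, "haps": [("EUR", True), ("EUR", False)]},
    {"case": False, "haps": [("EUR", False), ("AFR", True)]},
    {"case": False, "haps": [("EUR", False), ("EUR", True)]},
]
tables = haplotype_tables(people)
eur_or = crude_odds_ratio(tables["EUR"])
print(tables, round(eur_or, 2))
```

Stratifying by ancestral background in this way means each odds ratio compares carriers and non-carriers of the same local ancestry, so differences between strata reflect the ancestral origin of the allele rather than overall ancestry composition.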

Distribution of ancestry
Following all quality control, a total of 2652 Hispanics (1298 MS cases and 1354 controls) and 2435 African Americans (1298 MS cases and 1137 controls) remained for analysis. As previously described, Hispanic MS cases and controls in this dataset are on average 74% European, 15% Native American, and 11% African [20]. There is slightly less European ancestry and slightly more Native American ancestry in individuals with MS compared to controls (71% vs 76% respectively for European, p = 3.59 × 10^-12; 18% vs 13% respectively for Native American, p = 3.61 × 10^-16). Geographical differences in ancestry are observed, with a greater proportion of Native American ancestry identified in Hispanics residing on the west coast (39% vs 11%) and a greater proportion of European ancestry in Hispanics residing on the east coast of the United States (79% vs 54%), in accordance with relocations of native populations that occurred with European colonization of the Americas [27]. African Americans, on the other hand, have a non-zero proportion of Native American ancestry and are on average 78% African, 20% European, and 2% Native American, with similar proportions in cases and controls and across geographical regions, as previously described [20].

Imputation accuracy of HLA alleles
While imputation accuracy varied by allele and by population, concordance rates remained high following our stringent quality filters. On average across populations, quality rates were lowest for HLA-DRB1 (59%) and highest for HLA-C (94%), and concordance rates were lowest for HLA-DRB1 (93%) and highest for HLA-DQB1 (99%) (S1 Table). For 1000 Genomes Africans (AFR), concordance rates ranged from 92% for HLA-DRB1 to 98% for HLA-B, with an average of 95% across alleles. For 1000 Genomes Project Hispanics (AMR), concordance rates ranged from 95% for HLA-DRB1 to 99% for HLA-DQB1, with an average of 98% across alleles.
Most notably, concordance rates also exceeded 90% across all alleles in our African American case sample, ranging from 90% for HLA-DRB1 to ~100% for HLA-DQB1 (S1 Table).

Sex modification of MS risk
One of the twelve independent MHC variants identified for association with MS in the Hispanic sample demonstrated risk modification by sex (Table 3). In a marginal analysis, females with HLA-DQB1*02:01 were at significant risk for MS (OR = 1.24, p = 1.72 × 10^-02), while no significant effect was observed for males (Table 4). Two of the six independent MHC variants identified for association with MS in the African American sample demonstrated risk modification by sex: HLA-A*02:01 and rs2516423 (Table 3). A highly significant protective effect for HLA-A*02:01 was observed in females (OR = 0.63, p = 1.21 × 10^-05), while no effect was seen in males (Table 4). Conversely, a highly protective effect for rs2516423 was observed in males (OR = 0.55, p = 1.82 × 10^-03), while no effect was seen in females (Table 4). We found no significant difference in risk for females and males with HLA-DRB1*15:01 (interaction p > 0.05 in both Hispanics and African Americans, Table 3), whereas it has been suggested in other studies that HLA-DRB1*15:01 is more prevalent in females [28,29] and that females may carry a higher HLA-DRB1*15:01-specific risk [30].

Local ancestry across the MHC
Within the Hispanic sample, we found an association of local ancestry with MS risk across the entire MHC from 29 to 34 MB (LRT p < 0.05), with ancestral differences between MS cases and controls (Fig 3A). In accordance with previous reports in the literature of MHC-specific admixture-enabled selection due to rapid adaptive evolution [31], we observed, relative to global ancestry, an increase in local African ancestry and a decrease in local European ancestry for both MS cases and controls across the entirety of the MHC (Fig 3B). However, the magnitude of the divergence of local ancestry from global ancestry differs by MS status. Relative to controls, MS cases had an increase in both EUR and AFR ancestry (average increase of 4.5% EUR and 1% AFR in MS cases relative to controls), in conjunction with a substantial decrease in NAM ancestry (average decrease of 5.5% in MS cases relative to controls, LRT p < 0.01 from 31 to 32.6 MB). The same increase in local African ancestry and decrease in local European ancestry relative to global ancestry was observed across the MHC in all African American samples. While the magnitude of divergence did not statistically differ between MS cases and controls (LRT p > 0.05), a pattern similar to that seen in the Hispanic sample began between the Class I and Class III gene region and extended into Class II: relative to controls, MS cases had an increase in both EUR and AFR ancestry, in conjunction with a decrease in NAM ancestry (Fig 3C). Given the low levels of NAM ancestry in the population (~2% in MS cases and controls), the average decrease in MS cases relative to controls was modest (only ~0.4% across this region from ~31 to 32.6 MB, as compared to 5.5% in Hispanics); nonetheless, the pattern is strikingly consistent.

Global ancestry modification of MS risk
Two of the twelve independent MHC variants identified for association with MS in the Hispanic sample (Table 1) demonstrated global ancestry risk modification (p < 0.05) (Table 3). To investigate whether the degree of MS risk attributed to the variant allele was dependent upon global ancestry composition, we estimated the odds of MS under various ancestral scenarios. For rs2844503, we first considered an individual with no NAM ancestry, and we estimated odds of MS to be 0.29, 0.32, and 0.35 for each of 0, 1, or 2 observed variant alleles utilizing the parameter estimates obtained in Eq (3). We then considered an individual with 15% NAM ancestry (the mean observed in our sample) and estimated odds of MS to be 0.38, 0.51, and 0.58 for each of 0, 1, or 2 observed variant alleles. In contrast, the estimated odds of MS increased exponentially with each allelic copy for an individual with 82% NAM ancestry (the maximum observed in our sample), with estimated odds of MS being 1.32, 4.25, and 13.67 for 0, 1, or 2 observed variant alleles. Thus, we conclude that variation in rs2844503 presents a risk for MS primarily in individuals with a high proportion of NAM ancestry (Fig 4A). While individual global ancestry drastically alters the effect of rs2844503 on MS risk, we found that individual local ancestry within the specified genomic region did not demonstrate risk modification (Table 3).
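The per-copy effect implied by the reported odds for rs2844503 can be checked with simple arithmetic, sketched here in Python. The odds are the values quoted in the text; the variable names are illustrative.

```python
# Reported estimated odds of MS for 0, 1, and 2 copies of the rs2844503
# variant allele, keyed by the NAM global ancestry proportion of the scenario
odds_by_nam = {
    0.00: (0.29, 0.32, 0.35),
    0.15: (0.38, 0.51, 0.58),
    0.82: (1.32, 4.25, 13.67),
}

# Ratio of odds between successive allele counts (1 vs 0 copies, 2 vs 1 copies)
per_copy = {nam: (odds[1] / odds[0], odds[2] / odds[1])
            for nam, odds in odds_by_nam.items()}

for nam, (first, second) in per_copy.items():
    print(f"NAM = {nam:.2f}: 1 vs 0 copies = {first:.2f}, "
          f"2 vs 1 copies = {second:.2f}")
```

At 82% NAM ancestry both ratios are about 3.2, which is what "increasing exponentially with each allelic copy" means in practice: each additional copy multiplies the odds by roughly the same factor. With no NAM ancestry the corresponding ratios are close to 1.1.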
For rs3021302, we found that estimated MS risk decreased recessively for an individual with 82% NAM ancestry: the estimated odds of MS were 6.82 for zero or one variant allele copy and 2.58 for two copies. Conversely, estimated MS risk increased recessively for an individual with no NAM ancestry (estimated odds of MS were 0.54 for zero or one allele copy and [?].06 for two copies) and for an individual with 15% NAM ancestry (estimated odds of MS were 0.86 for zero or one allele copy and 2.97 for two copies). We conclude that variation in rs3021302 presents a risk for MS primarily in individuals with a low proportion of NAM ancestry and confers a protective effect in individuals with a high proportion of NAM ancestry (Fig 4B). An interaction between local ancestry and rs3021302 was also observed (Table 3).
One of the six independent MHC variants identified for association with MS in the African American sample (Table 2) demonstrated global ancestry risk modification (Table 3): rs760145 (additive OR = 0.75 in the full sample), an intronic variant in HLA-F-AS1, which also represents a novel signal in African Americans. The estimated risk of MS decreased exponentially with each allelic copy for an individual with 16% AFR ancestry (the minimum observed in our sample), with estimated odds of MS being 5.88, 1.54, and 0.40 for 0, 1, and 2 observed variant alleles (Fig 5). A more moderate decrease in risk was seen for an individual with 78% AFR ancestry (the mean observed in our sample), with estimated odds of MS being 1.57, 1.17, and 0.87 for 0, 1, and 2 observed variant alleles. In contrast, a slight increase was observed for an individual with 98% AFR ancestry (the maximum observed in our sample), with estimated odds of MS being 1.02, 1.07, and 1.12 for 0, 1, and 2 observed variant alleles. We conclude that the protective effect of variation in rs760145 occurs in individuals of low AFR ancestry, with minimal effect seen for those with high levels of AFR ancestry. There was no significant interaction between local ancestry and rs760145 (Table 3).
Fig 3 (caption, partially recovered). [...] local ancestry relative to global ancestry is observed in both MS cases and controls across the entire region, consistent with admixture-enabled selection [31]. On average, Hispanic controls exhibit a greater decrease in European local ancestry relative to global ancestry than MS cases, in conjunction with a marked increase in Native American local ancestry. Differences peak between the class I and class III gene region, corresponding to the increase in -log10P denoted in (A). C. Average percent change in local vs global ancestry for African American cases and controls. Consistent with Hispanics, we observe an increase in African local ancestry and a decrease in European local ancestry relative to global ancestry in MS cases and controls across the entire region. The changes are of consistent magnitude in MS cases and controls, except for ~31 to ~32.6 MB between the class I and class II gene regions, where we see on average that African American controls exhibit a lesser increase in African local ancestry relative to global ancestry than MS cases, again in conjunction with a marked increase in Native American local ancestry (although -log10P does not reach a threshold for significance). https://doi.org/10.1371/journal.pone.0279132.g003

Local ancestry modification of MS risk
Local ancestry risk modification (p < 0.05) was observed for four of the MS-associated classical alleles (HLA-A*02:01, HLA-B*53:01, HLA-DRB1*15:01, and HLA-DQB1*06:02) and two SNPs (rs6929950 and rs3021302) (Table 3). Apart from HLA-DQB1*06:02, for which the interaction was observed in both samples (Table 3), statistical significance of the interaction between local ancestry and allele or SNP may have been observed in only one of the Hispanic or African American samples. Nevertheless, a haplotype model was used in both samples to assess allele-specific association stratified by ancestral allele origin and to determine consistency in the direction of ancestral effect (Table 5). Given the recessive nature of rs3021302, a haplotypic model was not applied.
For HLA-A*02:01, while the significant local ancestry interaction was observed in the African American sample (p = 4.14 × 10^-02), both the Hispanic and the African American sample demonstrate a stronger protective effect on MS for ancestral EUR alleles than for AFR alleles (EUR OR = 0.63, AFR OR = 0.80 in Hispanics; EUR OR = 0.56, AFR OR = 0.77 in African Americans). Interestingly, there are more NAM case alleles than control alleles in both samples; although the difference is non-significant, this suggests a possible risk effect for HLA-A*02:01 alleles of NAM ancestral descent (Table 5).
For HLA-B*53:01, a local ancestry interaction was observed in Hispanics (p = 4.37 × 10^-02) and nominally in African Americans (p = 7.02 × 10^-02) (Table 3). While HLA-B*53:01 demonstrated a significant protective effect in the overall African American sample (Table 2), no association was seen with MS in the overall Hispanic sample (p = 1.12 × 10^-01). Yet, within the Hispanic sample, a nominal protective effect was seen for HLA-B*53:01 alleles of AFR descent (Table 5). Although the majority of HLA-B*53:01 alleles are of AFR descent, a non-zero number of EUR and NAM alleles were observed in both Hispanics and African Americans, consistent with expected frequencies from reference populations (S4 Table). In both samples, more EUR case alleles were observed than EUR control alleles (3 vs 2 in Hispanics and 1 vs 0 in African Americans).
Lastly, a significant local-ancestry interaction was identified in the African American sample for rs6929950, an intronic variant within OR5V1 (Table 5). Although this variant was identified as a novel protective variant for MS in the Hispanic sample (Table 1), marginal association was also observed in the African American sample (OR = 0.82, p = 4.97 × 10^-02). In the African American sample, most observed alleles are of AFR descent (OR = 0.84, p = 9.44 × 10^-02). Only 12 observed alleles are of EUR descent (6 in MS cases and 6 in controls, demonstrating no effect), and 5 observed alleles are of NAM descent (all 5 in controls, demonstrating a highly protective effect). The small non-AFR sample size and lack of effect observed for EUR alleles is likely driving the observed local ancestry interaction, but larger samples would be needed to determine validity. Conversely, no real difference in effect is seen between EUR (OR = 0.37) and AFR (OR = 0.44) ancestral alleles in Hispanics.

Discussion
In this large multi-ethnic MS cohort we have identified an independent contribution of HLA-DRB1*15:01 and HLA-DQB1*06:02 to MS risk in both the Hispanic and African American sample. We have additionally identified a striking decrease in NAM ancestry in cases relative to controls across the MHC which can be seen in both Hispanics and African Americans, peaking between the Class I and Class III gene region. We found several MS susceptibility variants to have an effect that is modified by global ancestry, indicating ancestral differences which may be due in part to correlated socio-economic or environmental factors, and we have also identified MS susceptibility variants which have an effect that is modified by local ancestry, indicating true genetic differences in the degree of risk/protection exerted across ancestral backgrounds. We have discovered several novel susceptibility variants, and confirmed robust replication (p < 1 × 10^-03) for six classical alleles in Hispanics and two classical alleles in African Americans which had been previously identified in Europeans. Despite our Hispanic and African American study samples being similar in size, we have observed substantially more replication of previously identified alleles in Hispanics at the specified significance level. This contrasts with another study of the MHC in Hispanics and African Americans conducted by Chi et al., which reported association with MS at the same significance level for only HLA-DRB1*15:01 in Hispanics [12] and HLA-DRB1*15:01 and HLA-DRB1*03:01 in African Americans. At a broader level, they identified more replication of classical alleles at p < 0.05 in African Americans than Hispanics and concluded that there may be a smaller overlap in MHC-specific MS genetic risk between Hispanics and Europeans than that of African Americans and Europeans.
This contrast could be due in part to their relatively smaller Hispanic case collection (326 Hispanic MS cases) as well as differences in Hispanic ancestral background between the studies. Chi et al. have reported considerably more Native American (average 34% vs 18% among MS cases) and less European ancestry (average 56% vs 71% among MS cases) than our sample [12].
Using stepwise conditional modeling we found that HLA-DQB1*06:02 was significantly associated with MS after conditioning on HLA-DRB1*15:01 in African Americans, and vice versa in Hispanics. This provides evidence that in admixed populations, HLA-DRB1*15:01 and HLA-DQB1*06:02 contribute to MS risk independently of one another. Within European populations, HLA-DRB1*15:01 is most often found as part of an extended haplotype with HLA-DQB1*06:02, and a decade of fine-mapping research has sought to distinguish which is the predisposing allele [32]. One of the largest studies in Europeans concluded that the association signal could be localized to HLA-DRB1*15:01 [6]. A similar study in African Americans, containing a small subset of the samples in the current study, also identified HLA-DRB1*15:01 as the predominant signal, finding no effect of HLA-DQB1*06:02 in the absence of HLA-DRB1*15:01; however, the study included only ~350 patients with MS and 300 controls, and fewer than 25 individuals were identified as being HLA-DRB1*15:01-negative and HLA-DQB1*06:02-positive in either the patient or control cohort [11]. Within our larger study samples, however, we do observe an effect of HLA-DQB1*06:02 in the absence of HLA-DRB1*15:01 in both Hispanics and African Americans (p < 0.05 for both populations, S5 Table), consistent with our determination of independence from the conditional model. Although an independent association for HLA-DRB1*15:01 and HLA-DQB1*06:02 has been undetectable in Europeans due to high LD, it is possible that a European model for biologically independent contributions also exists.
A striking decrease of NAM ancestry in Hispanic MS cases relative to controls was found across the extended MHC. This suggests that protective NAM haplotypes are likely present across the region. These differences peak between the Class I and Class III gene regions and are centered on the previously identified MICB/LST1 locus. While the magnitude of this difference is substantially smaller in African Americans, a similar pattern is observed. Further work is needed to understand the role this difference plays in variation in both disease incidence and disease presentation between the two populations. Minimal effort has been made to fine-map the previously identified MICB/LST1 locus in European populations [6], and the signal has not yet been refined. Hispanic or indigenous populations with a substantial NAM component may be the most advantageous for fine-mapping of the locus. Notably, within this region we also identified three independent signals in Hispanics (with one, rs2844503, demonstrating NAM global ancestry risk modification in the absence of a local ancestry modification) and one in African Americans, all of which were in low LD (R2 ≤ 0.2) with the variants previously identified in European populations. This may indicate that substantial locus heterogeneity is also present within this region, and the additional presence of variants modified by global ancestry suggests that environmental factors may also play a role in the influence of the MICB locus on MS susceptibility.
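As an illustration of the carrier-stratified comparison discussed earlier in this section (the effect of HLA-DQB1*06:02 among individuals who do not carry HLA-DRB1*15:01), the following minimal sketch computes an odds ratio with a Woolf-type confidence interval from a single 2×2 table. It is not the regression model used in this study; the function name `odds_ratio_ci` and all counts are hypothetical and are not taken from S5 Table.

```python
import math

def odds_ratio_ci(case_carriers, case_noncarriers,
                  ctrl_carriers, ctrl_noncarriers, z=1.96):
    """Odds ratio and Woolf (log-scale) confidence interval for one 2x2 table.

    Intended use in this sketch: restrict the table to individuals who are
    negative for one allele, then compare carriers and non-carriers of the
    second allele between cases and controls.
    """
    cells = [case_carriers, case_noncarriers, ctrl_carriers, ctrl_noncarriers]
    # Haldane-Anscombe correction: add 0.5 to every cell if any cell is zero,
    # so that the odds ratio and its standard error remain defined.
    if any(c == 0 for c in cells):
        cells = [c + 0.5 for c in cells]
    a, b, c, d = cells
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical counts among first-allele-negative individuals only.
or_est, ci_low, ci_high = odds_ratio_ci(40, 960, 20, 980)
print(f"OR = {or_est:.2f} (95% CI {ci_low:.2f} to {ci_high:.2f})")
```

With few carriers in a stratum the interval becomes very wide, which is the situation that limited the earlier, smaller study described above.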
We identified three novel protective variants for MS across the extended MHC, one in the Hispanic sample (rs6929950, an intronic variant within OR5V1) and two in the African American sample (classical HLA-B*53:01 and rs760145, an intronic variant within HLA-F-AS1). While rs6929950 was detected at the specified significance threshold (p < 1.00 × 10^−03) in Hispanics and demonstrated only nominal association in African Americans (p = 4.97 × 10^−02), more than 75% of the variant alleles identified in both samples were of African origin. It is worth noting that local ancestry risk modification was detected for both rs6929950 and HLA-B*53:01; in both instances, however, the number of non-African alleles detected was low (≤5 for HLA-B*53:01 and ≤20 for rs6929950), so these local ancestry interactions should be interpreted with caution. Variant rs6929950, located within OR5V1, has regulatory potential, with a RegulomeDB score of 3a [33] indicative of its location within a transcription factor binding site and DNase peak. Although further refinement would be needed to attribute causality to the variant, OR5V1 represents an important biological candidate for MS susceptibility given the notable olfactory dysfunction among a number of neurodegenerative diseases [34]. HLA-B*53:01 also constitutes an important novel mechanism for MS protection which may be unique to individuals of African descent. It was the first HLA allele to be associated with resistance to severe malaria [35] and is found in 12% of African individuals while rarely occurring in other populations (S4 Table), suggestive of positive selection.
The third novel variant, intronic rs760145 in HLA-F-AS1, demonstrates differing effect allele frequencies by ancestry, being 0.435, 0.532, and 0.602 in the gnomAD African, European, and Americas populations, respectively (S4 Table). Global ancestry modification is also present, while no local ancestry modification is seen, indicating that these effect differences may represent complex socio-economic or other environmental interactions. There has been some previous evidence for association between variation within HLA-F-AS1 and MS [36]. However, moderate LD exists between HLA-A*02:01 and rs760145 in Europeans (R2 = 0.2), while LD is negligible in African Americans, which may explain why HLA-F-AS1 has not been identified as an independent MS locus in more recent studies [1].
We did not find significance for the previously identified African allele, HLA-DRB1*15:03, at the pre-specified significance level of 1.0 × 10^−04; however, after conditioning on the three independent African classical alleles, marginal significance was seen (OR = 1.23, p = 2.53 × 10^−02), indicating that HLA-DRB1*15:03 does contribute to MS risk within this sample. Marginal significance was eliminated after conditioning on intergenic rs28371315, pointing to the extended haplotypes that exist between DRB1 and DQB1.
Local ancestry risk modification was seen for several well-established MS susceptibility alleles, including HLA-DRB1*15:01, HLA-DQB1*06:02, and HLA-A*02:01. Given that these were among the first alleles identified in Europeans and have been consistently and robustly replicated, it is perhaps unsurprising that all three alleles demonstrate effects which are largely augmented on the EUR ancestral haplotype. Most notably, for HLA-DQB1*06:02, we find that MS risk for EUR allele carriers is almost two times the risk for AFR allele carriers. Although HLA-DQB1*06:02 did not pass quality control thresholds in their study, a similar pattern for HLA-DRB1*15:01 was identified by Chi et al., who determined that the risk of MS conferred by the EUR HLA-DRB1*15:01 allele was three times higher than that conferred by the AFR HLA-DRB1*15:01 allele. In our study, fewer than 10 non-European HLA-DRB1*15:01 alleles were identified in the Hispanic or African American sample. Thus, although a local ancestry interaction was observed for this allele, little can be inferred from the ancestry-stratified analyses.
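To make the notion of an effect that differs by ancestral allele concrete, the sketch below compares two ancestry-specific odds ratios through their ratio and a Wald-type z-test on the log scale. This is a simplified illustration rather than the interaction model fitted in this study: it assumes the two estimates come from independent strata, and the function name `ratio_of_odds_ratios` and all input values are hypothetical.

```python
import math

def ratio_of_odds_ratios(or_a, se_log_a, or_b, se_log_b):
    """Compare two stratum-specific odds ratios (e.g. by local ancestry).

    Assumes the two log odds ratios are independent and approximately
    normally distributed. Returns the ratio OR_a / OR_b, the Wald z
    statistic, and a two-sided p-value.
    """
    log_ratio = math.log(or_a) - math.log(or_b)
    se = math.sqrt(se_log_a ** 2 + se_log_b ** 2)
    z = log_ratio / se
    # Two-sided p-value under the standard normal: 2 * (1 - Phi(|z|)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return math.exp(log_ratio), z, p

# Hypothetical estimates: allele on a EUR haplotype vs on an AFR haplotype.
ratio, z_stat, p_value = ratio_of_odds_ratios(3.0, 0.15, 1.6, 0.20)
print(f"ratio of ORs = {ratio:.2f}, z = {z_stat:.2f}, p = {p_value:.3g}")
```

When one stratum contains only a handful of alleles, its standard error dominates the denominator and the test has little power, which is why sparse ancestry strata call for cautious interpretation.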
Our sample represents the largest and most geographically diverse collection of Hispanic and African American individuals with MS to date. However, we acknowledge that our sample size is still limited and thus there are likely additional novel and previously identified MHC associations that have gone undetected in our sample. Similarly, detection of effects which are heterogeneous by sex or ancestral state may be limited, and false positive interactions may remain unidentified. Additionally, we may be missing an extended haplotype with a causal DRB1 allele which is linked to DQB1*06:02 but has gone undetected due to limited power [37]. Nonetheless, a previous study in the isolated founder population of Sardinia also identified an independent association for the HLA-DRB1*15:01 and HLA-DQB1*06:02 alleles [38], a finding which has also been supported in transgenic mouse models [39][40][41].
In conclusion, we observe a central role for ancestry in genetic risk modification across the MHC. Both global and local variant-specific ancestral risk modifications have been identified which may influence prevalence or phenotypic differences that have been observed across racial and ethnic groups [42][43][44]. More broadly, we have observed a decrease in Native American ancestry in MS cases relative to controls across much of the extended MHC, most notably among Hispanics, indicating that protective Native American haplotypes are likely present across much of the region. We have identified several MHC-specific MS susceptibility variants in the admixed African American and Hispanic samples which are rare in European populations and represent novel population-specific effects, and we have also utilized the differing LD patterns in Hispanics and African Americans to confirm an independent role for HLA-DRB1*15:01 and HLA-DQB1*06:02 in MS risk. Taken together, these results validate the importance of investigating MS susceptibility at an ancestral level in heterogeneous populations to identify population-specific genetic influences on disease risk, and they offer insight into the epidemiology of MS phenotypic diversity.
The current study focused on imputed genomic array data within the MHC, which contain rich information on genetic ancestry, population genomics, and MS susceptibility. Future research can integrate additional -omics data, including but not limited to transcriptomics, epigenomics, proteomics, and pharmacogenomics, to further decipher the relationship between genetic ancestry and MS susceptibility within the MHC and within other known MS loci [45], facilitating a comprehensive understanding of precision population health.
Supporting information S1